 gaussian noise


00482b9bed15a272730fcb590ffebddd-Supplemental.pdf

Neural Information Processing Systems

A.1 Training dataset pre-processing. We used 40,000 publicly available videos from YouTube that were available at a spatial resolution of at least 1920 × 1080 pixels. To avoid skewing the distribution of content too far from what may inform biological representation learning, we excluded most artificial content, such as screenshots and videos of computer games. To reduce video compression artifacts and prevent systematic downsampling artifacts, each segment was then spatially downsampled to a randomized height between 128 and 160 pixels. Each segment was then separated into 15 pairs of neighboring frames, and a randomly placed but spatially colocated patch of 64 × 64 pixels was cropped out of each frame pair. The order of the frame pairs was then randomized in a running buffer, and all RGB pixel values were normalized to the range between 0 and 1 before being fed into the model.
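The cropping and normalization step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `crop_colocated_pair`, the nested-list frame layout, and the assumption of 8-bit input values (hence the division by 255) are all hypothetical.

```python
import random

def crop_colocated_pair(frame_a, frame_b, patch=64, rng=random):
    """Crop the same randomly placed patch from two neighboring frames.

    Frames are nested lists indexed as [row][col][channel].
    Assumption: channel values are 8-bit (0..255), so dividing by 255
    maps them to the range [0, 1].
    """
    height, width = len(frame_a), len(frame_a[0])
    # One random offset is drawn and shared by both frames (colocated crop).
    top = rng.randrange(height - patch + 1)
    left = rng.randrange(width - patch + 1)

    def crop(frame):
        return [
            [[value / 255.0 for value in pixel] for pixel in row[left:left + patch]]
            for row in frame[top:top + patch]
        ]

    return crop(frame_a), crop(frame_b)
```

Sharing one offset between the two frames is what keeps the patch pair spatially aligned; only the location varies from pair to pair.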


A Training Regime

Neural Information Processing Systems

A.1 Implementation of the GPs. We use the GPyTorch package for the computation of GPs and their kernels. The NN linear kernel is implemented in all experiments as a 1-layer MLP with ReLU activations and hidden dimension 16. For the Spectral Mixture Kernel, we use 4 mixtures. A.2 Sines Dataset. For the first experiments on sine functions, we use the dataset from [9]. For each task, the input points $x$ are sampled from the range $[-5, 5]$, and the target values $y$ are obtained by applying $y = A \sin(x - \varphi) + \epsilon$, where the amplitude $A$ and phase $\varphi$ are drawn uniformly at random from the ranges $[0.1, 5]$ and $[0, \pi]$, respectively.
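A task sampler for the sines dataset might look like the sketch below. It assumes the target is A sin(x − φ) plus zero-mean Gaussian noise; the noise scale `noise_std`, the number of points, and the function name are placeholders that the text does not specify.

```python
import math
import random

def sample_sine_task(num_points=10, noise_std=0.1, rng=None):
    """Draw one sine regression task.

    Ranges for x, amplitude, and phase follow the text.
    Assumption: the additive term is Gaussian with std `noise_std`
    (a hypothetical value, not taken from the source).
    """
    rng = rng or random.Random()
    amplitude = rng.uniform(0.1, 5.0)
    phase = rng.uniform(0.0, math.pi)
    xs = [rng.uniform(-5.0, 5.0) for _ in range(num_points)]
    ys = [
        amplitude * math.sin(x - phase) + rng.gauss(0.0, noise_std)
        for x in xs
    ]
    return xs, ys, amplitude, phase
```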


Random Noise Defense Against Query-Based Black-Box Attacks

Neural Information Processing Systems

Query-based black-box attacks pose serious threats to machine learning models in many real applications. In this work, we study a lightweight defense method, dubbed Random Noise Defense (RND), which adds proper Gaussian noise to each query. We theoretically analyze the effectiveness of RND against query-based black-box attacks and the corresponding adaptive attacks. Our theoretical results reveal that the defense performance of RND is determined by the magnitude ratio between the noise induced by RND and the noise added by the attackers for gradient estimation or local search. A larger magnitude ratio leads to stronger defense performance of RND, and it is also critical for mitigating adaptive attacks. Based on our analysis, we further propose to combine RND with a plausible Gaussian augmentation Fine-tuning (RND-GF). This enables RND to add larger noise to each query while maintaining clean accuracy, yielding a better trade-off between clean accuracy and defense performance. Additionally, RND can be flexibly combined with existing defense methods, such as adversarial training (AT), to further boost adversarial robustness. Extensive experiments on CIFAR-10 and ImageNet verify our theoretical findings and the effectiveness of RND and RND-GF.
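The core idea of adding Gaussian noise to each query can be sketched as a thin wrapper around a model. This is an illustrative sketch only: the name `rnd_defended`, the flat-list input format, and the default noise scale `nu` are assumptions, not values from the paper.

```python
import random

def rnd_defended(model, nu=0.02, rng=random):
    """Wrap `model` so every query is perturbed with Gaussian noise.

    `model` is any callable taking a flat list of floats.
    `nu` is a hypothetical noise scale chosen for illustration.
    """
    def query(x):
        # Fresh, independent noise is drawn for every query.
        noisy = [value + nu * rng.gauss(0.0, 1.0) for value in x]
        return model(noisy)
    return query
```

Because the noise is redrawn on every call, two queries at the same input generally return different outputs, which is what disturbs an attacker's finite-difference or local-search estimates.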



Supplementary Material: A Trainable Spectral-Spatial Sparse Coding Model for Hyperspectral Image Restoration. A Implementation details

Neural Information Processing Systems

In this section, we provide additional implementation details, which are useful to reproduce our experiments (note that the code is also provided). For each band $i \in \{0, \dots, c-1\}$, the standard deviation of the Gaussian noise is defined as: $\sigma_i = \beta \exp\left(-\frac{1}{4\eta^2}\left(\frac{i}{c-1} - \frac{1}{2}\right)^2\right)$. A basic centering step is used for each input patch of our model. More precisely, for the first layer, each band of the input hyperspectral image is centered independently prior to patch extraction, and the means are added back after decoding. For the second layer, patches are centered independently for each band (and similarly, the means are added back after decoding). Code and patch sizes. The hyperparameters of our model are presented in Table 1.
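As a rough illustration of the band-dependent noise level, the sketch below assumes the formula reads σ_i = β exp(−(1/(4η²)) (i/(c−1) − 1/2)²), that is, a Gaussian-shaped profile peaking at the middle band. The function name and any values passed for `beta` and `eta` are hypothetical; the text does not give them here.

```python
import math

def band_sigmas(c, beta, eta):
    """Per-band noise standard deviations for c spectral bands (c >= 2).

    Assumption: sigma_i = beta * exp(-(i/(c-1) - 1/2)^2 / (4 * eta^2)).
    """
    return [
        beta * math.exp(-((i / (c - 1) - 0.5) ** 2) / (4.0 * eta ** 2))
        for i in range(c)
    ]
```

Under this reading, `beta` sets the peak noise level and `eta` controls how quickly the noise level falls off toward the first and last bands.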



Derivation of Eqs. 3 and 5

Neural Information Processing Systems

A.1 Derivation of Eq. (3). By expanding Eq. (2) with the definition $\varepsilon^l_{i,t} = x^l_{i,t} - \mu^l_{i,t}$, we obtain an expression for $E_t$, Eq. (6). We note that each $x^l_{i,t}$ influences $E_t$ in two ways: (i) it occurs in Eq. (6) explicitly, but (ii) it also determines the values of $\mu^{l-1}_{k,t}$ via Eq. Considering also the special cases of $l = L$ and $l = 0$, we obtain Eq. (3). We note that $\theta^{l+1}_{i,j}$ affects the value of the function $E_t$ of Eq. (6) by influencing $\mu^l_{i,t}$ via Eq. Here, we provide further details about training PCNs, useful to reproduce them. Furthermore, we have applied a decay factor of 0.9 to $\gamma$, applied each time the energy failed to decrease.
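The decay rule for γ described above can be sketched as a small helper. This is an assumption-laden illustration: the function name is hypothetical, and "failed to decrease" is interpreted here as the new energy being greater than or equal to the previous one.

```python
def decay_on_plateau(gamma, previous_energy, energy, decay=0.9):
    """Return the updated gamma after one energy evaluation.

    Assumption: the energy "fails to decrease" when it is greater than
    or equal to its previous value; gamma is then scaled by `decay`.
    """
    if energy >= previous_energy:
        return gamma * decay
    return gamma
```

For example, starting from `gamma = 1.0`, an energy going from 2.0 to 2.5 shrinks γ to 0.9, whereas an energy going from 2.0 to 1.5 leaves it unchanged.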



Discovering and Overcoming Limitations of Noise-engineered Data-free Knowledge Distillation

Neural Information Processing Systems

Distillation in neural networks using only samples randomly drawn from a Gaussian distribution is possibly the most straightforward solution one can think of for the complex problem of knowledge transfer from one network (teacher) to another (student). If done successfully, it can eliminate the need for the teacher's training data for knowledge distillation and avoid the privacy concerns that often arise in sensitive applications such as healthcare. There have been some recent attempts at Gaussian noise-based data-free knowledge distillation; however, none of them offers a consistent or reliable solution. We identify the shift in the distribution of hidden-layer activations as the key limiting factor, which occurs when Gaussian noise is fed to the teacher network instead of the accustomed training data. We propose a simple solution to mitigate this shift and show that for vision tasks, such as classification, it is possible to achieve performance close to the teacher by using only samples randomly drawn from a Gaussian distribution.


Appendices: Score-based Source Separation with Applications to Digital Communication Signals

Neural Information Processing Systems

During source separation, this presumably results in a noisier estimate of the SOI in comparison to mixtures with no additional background noise. We estimated the SNR to be 16.9 dB by averaging across multiple samples. The dotted black curve on the left of Figure G.2 is a presumed lower bound on the BER, obtained by accounting for the magnitude of the background noise and modeling it as additive white Gaussian noise.
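To make the relationship between noise level and bit error rate concrete, the sketch below computes an SNR in dB from signal and noise powers and evaluates the textbook BER of Gray-coded QPSK in additive white Gaussian noise. The choice of QPSK and the use of Eb/N0 as the input are assumptions made for illustration; the excerpt does not state which modulation or normalization underlies the curve in Figure G.2.

```python
import math

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from linear powers."""
    return 10.0 * math.log10(signal_power / noise_power)

def qpsk_ber_awgn(ebn0_db):
    """Textbook BER of Gray-coded QPSK (same as BPSK) in AWGN.

    BER = Q(sqrt(2 * Eb/N0)) = 0.5 * erfc(sqrt(Eb/N0)),
    with Eb/N0 given in dB. Assumption: QPSK is used only as an example.
    """
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebn0))
```

A curve of this kind decreases monotonically with Eb/N0, which is why a fixed background noise floor translates into a floor on the achievable BER.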